Skip to content

BLD: Build shared c dependencies as a library #47081

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 9 commits into from

Conversation

lithomas1
Copy link
Member

@lithomas1 lithomas1 commented May 21, 2022

Does anyone else want to test this? I can only test with -j4 on my old 2-core i3.

@lithomas1 lithomas1 marked this pull request as ready for review May 21, 2022 13:50
@jreback jreback added the Build Library building on various platforms label May 21, 2022
@@ -16,5 +16,4 @@ runs:
python -m pip install -e . --no-build-isolation --no-use-pep517 --no-index
shell: bash -el {0}
env:
# Cannot use parallel compilation on Windows, see https://github.com/pandas-dev/pandas/issues/30873
N_JOBS: ${{ runner.os == 'Windows' && 1 || 2 }}
N_JOBS: 4
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might still be worth aligning this with the number of core on the GH hosted runners (2 Window/Linux, 3 MacOS): https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources=

@jbrockmendel
Copy link
Member

Does this affect build size or speed or inlining or ...?

@Dr-Irv
Copy link
Contributor

Dr-Irv commented May 23, 2022

I like this PR, and want to try it out, but my laptop where I do 8-core builds on Windows that always fails is out of commission for a week or so. Once I get that back, I will give it a try and let you know the outcome.

setup.py Outdated
],
"include_dirs": [
"pandas/_libs/tslibs/src/datetime",
sysconfig.get_path("include"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need these get_path("include") calls? I think the compiler should handle that natively

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason, the compiler was not able to find Python.h without this.

@WillAyd
Copy link
Member

WillAyd commented Jun 1, 2022

@lithomas1 do you know what makes the compilation process different for these versus shared libraries created from compiled Cython modules? I'm slightly -1 on adding this complexity and certainly -1 for expanding it beyond these libraries for what seems like a very marginal gain

@Dr-Irv
Copy link
Contributor

Dr-Irv commented Jun 1, 2022

I just tried this PR to build the latest version on my Windows laptop using python setup.py build_ext -j 8 and it worked perfectly. I usually have to do it 2 or 3 times because of the conflicts. So, as the person who created the original issue, I'm +8 (to match my cores) on this!

@lithomas1
Copy link
Member Author

@lithomas1 do you know what makes the compilation process different for these versus shared libraries created from compiled Cython modules? I'm slightly -1 on adding this complexity and certainly -1 for expanding it beyond these libraries for what seems like a very marginal gain

The problem is that there seems to be no way to link against other Extensions(the shared library made from the Cython module) from another Extension. The first option from SO Link is what I'm doing (the other options don't seem to be too relevant, and are also going over my head).

The other benefit of this is that the parser and the tslibs c sources don't get recompiled many times.

@mroeschke
Copy link
Member

Would this potentially help solve #47305?

@WillAyd
Copy link
Member

WillAyd commented Jun 15, 2022

The problem is that there seems to be no way to link against other Extensions(the shared library made from the Cython module) from another Extension. The first option from SO Link is what I'm doing (the other options don't seem to be too relevant, and are also going over my head).

Does the depends keyword argument not help here?

@simonjayhawkins
Copy link
Member

Would this potentially help solve #47305?

the longer #47305 remains unresolved, I think the higher chance of failure when bisecting regressions (but can maybe alter the bisect process if this becomes an issue)

so +1 here if solves #47305

@jreback
Copy link
Contributor

jreback commented Jun 16, 2022

ok let's merge this and see

@lithomas1
Copy link
Member Author

lithomas1 commented Jun 16, 2022

The problem is that there seems to be no way to link against other Extensions(the shared library made from the Cython module) from another Extension. The first option from SO Link is what I'm doing (the other options don't seem to be too relevant, and are also going over my head).

Does the depends keyword argument not help here?

I'm pretty sure I tried this and it didn't work, let me double check.

@WillAyd
Copy link
Member

WillAyd commented Jun 16, 2022

I am fine with this for now if it fixes immediate issues. Can always clean up later. Also exploring different build tooling in #47380 which could help

@Dr-Irv
Copy link
Contributor

Dr-Irv commented Jun 16, 2022

I just tried this PR to build the latest version on my Windows laptop using python setup.py build_ext -j 8 and it worked perfectly. I usually have to do it 2 or 3 times because of the conflicts. So, as the person who created the original issue, I'm +8 (to match my cores) on this!

So I upgraded my Visual Studio installation to Visual Studio 2019, and tried this PR, using python setup.py build_ext -j 8 and now have the original problem that I reported in #30873 . Sigh...

Note that using python setup.py build_ext -j 4 works fine.

Having said that, I think I know what the issue is, which is that there were still references to np_datetime.c that appeared after a merge, so I did edits to the setup.py to fix that, and then it worked. Created a PR lithomas1#16 for @lithomas1 to review and integrate on his repo. Then he can update his branch within this PR.

@@ -16,7 +16,4 @@ runs:
python -m pip install -e . --no-build-isolation --no-use-pep517 --no-index
shell: bash -el {0}
env:
# Cannot use parallel compilation on Windows, see https://github.com/pandas-dev/pandas/issues/30873
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless this address's @simonjayhawkins #47081 (comment) I would prefer to keep this at N_JOBS: 1

Otherwise, I would suggest this be specific to the number of cores available on GHA hosted runners by OS: #47081 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless this address's @simonjayhawkins #47081 (comment)

I've needed to restart a few workflows this last week or so as they fail on the initial build, but it maybe once it is fixed that the initial build is successful, that subsequent incremental builds for commits in the range of the last couple of weeks could also be successful.

And it could also be that if this is a reoccurring problem that I can change the workflow to repeat the previous build step if the pandas import fails.

so I am happy that any comments/concerns are fully addressed before merging this.

@Dr-Irv
Copy link
Contributor

Dr-Irv commented Jul 17, 2022

@lithomas1 see my suggested changes in lithomas1#16

@lithomas1
Copy link
Member Author

I haven't touched this PR in a while, since we are considering Meson(working on exploring this front)/Cmake as our new build system. If the other core devs still think this is worthwhile, I am happy to pick it back up though.

@simonjayhawkins
Copy link
Member

I haven't touched this PR in a while, since we are considering Meson(working on exploring this front)/Cmake as our new build system. If the other core devs still think this is worthwhile, I am happy to pick it back up though.

@lithomas1 I'm happy if you are working on different approach instead.

xref #47081 (comment)

I don't recall seeing undefined symbol: pandas_datetime_to_datetimestruct locally or on bisect workflows for a while. Not an issue on CI at the moment either, #47341

@github-actions
Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Aug 18, 2022
@lithomas1 lithomas1 removed the Stale label Aug 21, 2022
@lithomas1
Copy link
Member Author

It might be time to revive this. This might fix the CI and I don't think the meson work will be ready before at least the end of this year. cc @WillAyd

@WillAyd
Copy link
Member

WillAyd commented Aug 30, 2022

Hmm I think I'd prefer not to hack setuptools like this versus just using Meson. Stuck with anything in particular on that I might be able to help with?

@WillAyd
Copy link
Member

WillAyd commented Aug 31, 2022

FWIW if you look at the build log this still generates the np_datetime.o file multiple times. Assuming a race condition between the various compilation / linking phases is the root of the issue I'm not sure this solves anything

@Dr-Irv
Copy link
Contributor

Dr-Irv commented Aug 31, 2022

FWIW if you look at the build log this still generates the np_datetime.o file multiple times. Assuming a race condition between the various compilation / linking phases is the root of the issue I'm not sure this solves anything

See my message above: #47081 (comment)

I created a PR for @lithomas1 that fixes this issue.

@WillAyd
Copy link
Member

WillAyd commented Aug 31, 2022

Hmm interesting. It looks like the build log shows that file still getting generated a few times.

Not opposed to merging this as is just don't think this does what we think it does. Just want to be wary of continued setuptools hacks going forward

@lithomas1
Copy link
Member Author

lithomas1 commented Sep 1, 2022

Hmm I think I'd prefer not to hack setuptools like this versus just using Meson. Stuck with anything in particular on that I might be able to help with?

I working out a couple of bugs in meson and meson-python. mesonbuild/meson#10639 is the one I'm looking into right now.

FWIW if you look at the build log this still generates the np_datetime.o file multiple times. Assuming a race condition between the various compilation / linking phases is the root of the issue I'm not sure this solves anything

See my message above: #47081 (comment)

I created a PR for @lithomas1 that fixes this issue.

Merged it, thanks.

Hmm interesting. It looks like the build log shows that file still getting generated a few times.

Not opposed to merging this as is just don't think this does what we think it does. Just want to be wary of continued setuptools hacks going forward

Yeah, this could end up not worth it ifit causes a performance egression somewhere or somehow borks the wheels.

@rhshadrach
Copy link
Member

rhshadrach commented Sep 5, 2022

Around 50% of the time I have to rebuild due to the undefined symbol issue with -j 4. I have a 16 core (32 threads) machine and with this PR running

find . -type f -exec touch {} +
python setup.py build_ext -j 32

three times all builds were successful. Doing the same on main, all three failed. Based on the conversation above, I think that makes me +32 on this PR.

@lithomas1
Copy link
Member Author

Small update on this:

Thanks all for the encouragement on this, and sorry for the long period without updates.
I think I planned to get this in in the meantime before I finished meson work, but it isn't worth it at this point to spend the time investigating perf, and other issues that this may cause.

Work on meson is a mixed bag

  • I can now compile and test on CI on Windows, macOS, and Linux.
  • Sadly, we have failing tests. :(
    - The only issue really worth worrying about is the SAS failures. Looks like there is a problem exporting the correct module name there for the Cython Extension Module.
    - I think some of our tests also depend on pandas being built inplace so which is causing some of the failures.
  • Finally, I have to figure out how to generate the pandas version from the tag. This will fix most of the other failures.

The bad news is I still don't think we'll be able to get meson enabled as the default build system before the end of the year. I've been upstreaming fixes to both the meson and meson-python libraries, and probably is going to be a while before they cut another release.

The good news is that we'll probably have some sort of meson support in main by next month(hopefully?) for brave people to try, it just won't be in the CI until meson and meson-python cut a new release with my fixes in them.

@lithomas1 lithomas1 closed this Sep 24, 2022
@WillAyd
Copy link
Member

WillAyd commented Sep 25, 2022

Finally, I have to figure out how to generate the pandas version from the tag. This will fix most of the other failures.

This is managed today via versioneer, which I think is only supported via setuptools. Might need to look at upstreaming support for meson in that as well

@rgommers
Copy link
Contributor

Re the version, it depends on what you want:

  1. Have an internal version string which contains the commit hash, but a project version (ending up in the wheel filename) of 1.6.0.dev0 without git hash.
  2. Also have the git hash reflected in the wheel version. This is what a number of projects using versioneer do.
  3. Getting the project version directly from a git tag, ala setuptools_scm.

(2) isn't possible easily in Meson right now, because using its vcs_tag feature in the project version would trigger a full reconfigure -see mesonbuild/meson#688 (comment). I'm not sure it would be accepted, although there is recent movement on the issue I linked (including a workaround implementation).

(3) seems to be more feasible.

SciPy used to do (2), and I switched it to (1) during the move to Meson. That has been fine so far. The benefit of having the git hash in the version is relatively minor, and it's still available at install time as python -c "import scipy; print(scipy.__version__)".

@lithomas1
Copy link
Member Author

SciPy used to do (2), and I switched it to (1) during the move to Meson. That has been fine so far. The benefit of having the git hash in the version is relatively minor, and it's still available at install time as python -c "import scipy; print(scipy.__version__)".

I'm a little worried that this will overwrite nightly wheels, though, since the filenames will be the same. This could be a problem if someone downstream is pinning the tests of pandas nightlies.

Is there something we can do in meson-python to workaround this? Since the version in meson has to be hard-coded, it might make sense for meson-python to take an option in the pyproject.toml, to get the version from git?

@rgommers
Copy link
Contributor

rgommers commented Sep 26, 2022

Is there something we can do in meson-python to workaround this? Since the version in meson has to be hard-coded, it might make sense for meson-python to take an option in the pyproject.toml, to get the version from git?

That sounds like a reasonable thing to do in meson-python. I think all that is needed is to (a) actually retrieve the git-based version, and (b) use that in the wheel name and in the METADATA file. That said:

  • this stuff is quite fragile (versioneer seems to be a never-ending source of bugs, it's way over-complicated), and
  • different projects may want to make different choices regarding what the version numbering should be (e.g., just the git hash vs. an incrementing count since the last tag - the latter can be useful for nightlies but problematic with shallow clones). EDIT: also setuptools_scm like behaviour

So perhaps rather than baking it into meson-python itself, there could be some hook mechanism so projects can provide their own function which returns the desired version string. WDYT?

@WillAyd
Copy link
Member

WillAyd commented Sep 26, 2022

Will have to excuse any ignorance in the question but is it possible to continue to use setuptools with meson?

@rgommers
Copy link
Contributor

It's perfectly fine to have two build systems live side-by-side in this repo. SciPy has had both setuptools and meson since last Christmas. The one choice to make is which of the two build systems gets invoked for pip install . - there's a single hook in pyproject.toml, so you cannot make that configurable.

@WillAyd
Copy link
Member

WillAyd commented Sep 26, 2022

If that's the case I would think it's probably easier to just stick with setuptools for the initial cutover then change to meson-Python at a later date. At the very least that way can move forward without having to solve the versioning issue

@WillAyd WillAyd mentioned this pull request Sep 27, 2022
5 tasks
@WillAyd
Copy link
Member

WillAyd commented Sep 27, 2022

Actually re-reading your comment I think we are talking about different things, but moved reply to lithomas1#19 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Build Library building on various platforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Parallel build_ext might fail due to file collisions
9 participants